Reinforcement Learning for Guiding the E Theorem Prover
Abstract
Automated Theorem Proving (ATP) systems search for a proof in a rapidly growing space of possibilities. Heuristics have a profound impact on search, and ATP systems make heavy use of heuristics. This work uses reinforcement learning to learn a metaheuristic that decides which heuristic to use at each step of a proof search in the E ATP system. Proximal policy optimization is used to dynamically select a heuristic from a fixed set, based on the current state of E. The approach is evaluated on its ability to reduce the number of inference steps in successful proof searches, as an indicator of intelligent search.
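The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the heuristic names, the linear softmax policy, and the state features are all hypothetical, chosen only to show how a policy might pick one heuristic from a fixed set and how the standard clipped surrogate term of proximal policy optimization scores that choice.

```python
import math
import random

# Hypothetical labels for a fixed set of heuristics (assumption, not from the paper).
HEURISTICS = ["prefer_small_clauses", "prefer_old_clauses", "prefer_goal_related"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(weights, state):
    """Linear policy (assumption): one weight row per heuristic, scored against the state."""
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def select_heuristic(weights, state, rng):
    """Sample a heuristic index from the policy; also return its probability."""
    probs = policy_probs(weights, state)
    idx = rng.choices(range(len(HEURISTICS)), weights=probs, k=1)[0]
    return idx, probs[idx]


def ppo_clip_objective(new_prob, old_prob, advantage, eps=0.2):
    """Clipped surrogate term of PPO for a single (state, action) sample."""
    ratio = new_prob / old_prob
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical state features, e.g. normalized clause counts of the prover.
    state = [0.4, 0.9]
    weights = [[0.0, 0.0] for _ in HEURISTICS]  # untrained policy: uniform choice
    idx, prob = select_heuristic(weights, state, rng)
    print(HEURISTICS[idx], prob)
```

In a training loop, the advantage would be derived from a reward signal; the abstract suggests one tied to the number of inference steps, but the exact reward is not stated here.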
Similar resources
Guiding a Theorem Prover with Soft Constraints
Attempts to use finite models to guide the search for proofs by resolution and the like in first order logic all suffer from the need to trade off the expense of generating and maintaining models against the improvement in quality of guidance as investment in the semantic aspect of the reasoning is increased. Previous attempts to resolve this tradeoff have resulted either in poor selection of m...
E - a brainiac theorem prover
We describe the superposition-based theorem prover E. E is a sound and complete prover for clausal first order logic with equality. Important properties of the prover include strong redundancy elimination criteria, the DISCOUNT loop proof procedure, a very flexible interface for specifying search control heuristics, and an efficient inference engine. We also discuss strength and weaknesses of t...
Experiments with Strategy Learning for E Prover
Automated theorem provers (ATPs) consist of a number of complicated algorithms that can be parameterized and combined in different ways. Examples of such parameterizations are clause weighting and selection schemes, term orderings, sets of inference and reduction rules used, etc. E [8] (like some other ATPs) has a language for packaging such useful combinations of parameterizations into...
The Heuristic Theorem Prover: Yet Another SMT Modulo Theorem Prover
HTP is an SMT Modulo theorem prover similar to many others [2–6, 9, 11]. As input, HTP accepts problems using the SMT-LIB format [8]. As output, HTP will answer either SAT, UNSAT or UNKNOWN. Alternatively, HTP can be run in a preprocessing mode in which the output is the simplified problem in SMT-LIB format. An evidence file showing the derivation in a human-readable form can be produced. There is...
Guiding Inference Through Relational Reinforcement Learning
Reasoning plays a central role in intelligent systems that operate in complex situations that involve time constraints. In this paper, we present the Adaptive Logic Interpreter, a reasoning system that acquires a controlled inference strategy adapted to the scenario at hand, using a variation on relational reinforcement learning. Employing this inference mechanism in a reactive agent architectu...
Journal
Journal title: Proceedings of the ... International Florida Artificial Intelligence Research Society Conference
Year: 2023
ISSN: 2334-0762, 2334-0754
DOI: https://doi.org/10.32473/flairs.36.133334